THUIR at TREC 2009 Web Track: Finding Relevant and Diverse Results for Large Scale Web Search
نویسندگان
چکیده
This is the 8th year that IR group of Tsinghua University (THUIR) participates in TREC. This year we focus on Web track, which contains two tasks, namely ad hoc and diversity. On ad hoc task, we improved the efficiency of our distributed retrieval system TMiner to handle terabytes of Web data. Then three studies have been done, namely page quality estimation, ranking feature analysis, and model comparison. On diversity task, we proposed several new approaches on searching strategy, user intention detection, and duplication elimination. To mine user‟s intention, we proposed and compared two different strategies, namely „searching + content-based diversity‟ which is a kind of result clustering, and „user based diverse intention prediction + searching‟ which is in the branch of query expansion.
منابع مشابه
Finding "Abstract Fields" of Web Pages and Query Specific Retrieval - THUIR at TREC 2004 Web Track
متن کامل
THUIR at TREC 2008: Relevance Feedback Track
Tsinghua University Information Retrieval Group (THUIR) has participated into the first Relevance Feedback Track of TREC2008. The TMiner search engine has been used as our text retrieval system, because the processing capability and flexibility of this system on large text data has been testified during many years’ Web Track and Terabyte Track. In the track, we studied two approaches: 1) query ...
متن کاملTHUIR at TREC 2003: Novelty, Robust and Web
This is the second time that Tsinghua University Information Retrieval Group (THUIR) participates in TREC. In this year, we took part in four tracks: novelty, robust, web and HARD, describing in following sections, respectively. A new IR system named TMiner has been built on which all experiments have been performed. In the system, Primary Feature Model (PFM) has been proposed and combined with...
متن کاملImproved Feature Selection and Redundance Computing - THUIR at TREC 2004 Novelty Track
This is the third years that Tsinghua University Information Retrieval Group (THUIR) participates in Novelty task of TREC. Our research on this year’s novelty track mainly focused on four aspects: (1) text feature selection and reduction; (2) improved sentence classification in finding relevant information; (3)efficient sentence redundancy computing; (4) effective result filtering. All experime...
متن کاملTHUIR at TREC 2008: Enterprise Track
We participate in document search and expert search of Enterprise Track in TREC2008. The corpus and tasks are same as the year before. Different from TREC 2007, the topics come from CSIRO Enquiries, and the topic statements are richer and more colloquial.. In document search, we look into the key resource page pre-selection, the use of anchor text, query classification, and multi-field search. ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009